Adaptive linear quadratic control using policy iteration - American Control Conference, 1994
Abstract
In this paper we present stability and convergence results for Dynamic Programming-based reinforcement learning applied to Linear Quadratic Regulation (LQR). The specific algorithm we analyze is based on Q-learning and it is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. This is the first convergence result for DP-based reinforcement learning algorithms for a continuous problem.
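The idea of policy iteration on a quadratic Q-function can be illustrated with a minimal sketch. It is not the paper's exact algorithm: it assumes a scalar system x' = a*x + b*u, an undiscounted cost q*x^2 + r*u^2, a stabilizing initial gain, and batch least squares for the evaluation step. All names (`a`, `b`, `q`, `r`, `k0`, `evaluate_policy`, `q_policy_iteration`) are hypothetical and chosen only for illustration. The exploration noise added to the control plays the role of the persistent excitation condition mentioned above.

```python
import random


def solve_linear(A, b):
    """Solve the small dense system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def features(x, u):
    """Quadratic basis so that Q(x, u) = h11*x^2 + 2*h12*x*u + h22*u^2."""
    return [x * x, 2.0 * x * u, u * u]


def evaluate_policy(a, b, q, r, k, rng, steps=200):
    """Estimate (h11, h12, h22) for the policy u = -k*x from simulated transitions.

    Assumes the undiscounted Bellman equation
        Q(x, u) - Q(x', -k*x') = q*x^2 + r*u^2,
    which is linear in the unknowns, and solves it by batch least squares.
    """
    AtA = [[0.0] * 3 for _ in range(3)]
    Atc = [0.0] * 3
    x = 1.0
    for _ in range(steps):
        u = -k * x + rng.gauss(0.0, 1.0)  # exploration noise keeps the data exciting
        cost = q * x * x + r * u * u
        x_next = a * x + b * u  # only the sampled transition is used for learning
        phi = [f - g for f, g in zip(features(x, u), features(x_next, -k * x_next))]
        for i in range(3):
            Atc[i] += phi[i] * cost
            for j in range(3):
                AtA[i][j] += phi[i] * phi[j]
        x = x_next
    return solve_linear(AtA, Atc)


def q_policy_iteration(a, b, q, r, k0, iterations=8, seed=0):
    """Alternate Q-function evaluation and greedy improvement; k0 must be stabilizing."""
    rng = random.Random(seed)
    k = k0
    for _ in range(iterations):
        h11, h12, h22 = evaluate_policy(a, b, q, r, k, rng)
        k = h12 / h22  # the minimizer of Q(x, u) over u is u = -(h12 / h22) * x
    return k


if __name__ == "__main__":
    print(q_policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0))
```

In this sketch the learned gain can be checked against the gain obtained from the scalar Riccati equation for the same `a`, `b`, `q`, and `r`. The system parameters appear only in the simulator, not in the evaluation or improvement steps.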
Similar resources
Adaptive Linear Quadratic Control Using Policy Iteration
In this paper we present stability and convergence results for Dynamic Programming-based reinforcement learning applied to Linear Quadratic Regulation (LQR). The specific algorithm we analyze is based on Q-learning and it is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. The performance of ...
Optimization of Markov jump linear system with controlled jump probabilities of modes
The optimal control problem of Markov jump linear quadratic model with controlled jump probabilities of modes is investigated. Two kinds of mode control policies, open-loop control policy and closed-loop control policy, are considered. By using policy iteration and performance potential concept, a sufficient condition for the optimal closed-loop control policy being better than the optimal o...
Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics
In this paper, the optimal adaptive leader-follower consensus of linear continuous time multi-agent systems is considered. The error dynamics of each player depends on its neighbors’ information. Detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...
Greedy Adaptive Critics for LQR Problems: Convergence Proofs
A number of success stories have been told where reinforcement learning has been applied to problems in continuous state spaces using neural nets or other sorts of function approximators in the adaptive critics. However, the theoretical understanding of why and when these algorithms work is inadequate. This is clearly exemplified by the lack of convergence results for a number of important situa...
Optimal Adaptive Control for a Class of Stochastic Systems - American Control Conference, Proceedings of the 1995
We study linear-quadratic adaptive tracking problems for a special class of stochastic systems expressed in the state-space form. This is a longstanding problem in the control of aircraft flying through atmospheric turbulence. Using an ELS-based algorithm and introducing dither in the control law we show that the resulting control achieves optimal cost in the limit, while simultaneously the unk...